EPViz: A flexible and lightweight visualizer to facilitate predictive modeling for multi-channel EEG

Scalp Electroencephalography (EEG) is one of the most popular noninvasive modalities for studying real-time neural phenomena. While traditional EEG studies have focused on identifying group-level statistical effects, the rise of machine learning has prompted a shift in computational neuroscience towards spatio-temporal predictive analyses. We introduce a novel open-source viewer, the EEG Prediction Visualizer (EPViz), to aid researchers in developing, validating, and reporting their predictive modeling outputs. EPViz is a lightweight and standalone software package developed in Python. Beyond viewing and manipulating the EEG data, EPViz allows researchers to load a PyTorch deep learning model, apply it to EEG features, and overlay the output channel-wise or subject-level temporal predictions on top of the original time series. These results can be saved as high-resolution images for use in manuscripts and presentations. EPViz also provides valuable tools for clinician-scientists, including spectrum visualization, computation of basic data statistics, and annotation editing. Finally, we have included a built-in EDF anonymization module to facilitate sharing of clinical data. Taken together, EPViz addresses an unmet need in EEG visualization. Our user-friendly interface and rich collection of features may also help to promote collaboration between engineers and clinicians.


Introduction
Scalp electroencephalography (EEG) has long been used as a window into the complex inner workings of the human brain. Formally, EEG measures the effects of postsynaptic currents in the brain and provides real-time information about neural activity [1,2]. Its cost-effectiveness and relative ease of acquisition have made EEG ubiquitous in both research and clinical practice. To a large extent, traditional EEG analysis has focused on group-level effects. Broadly, these studies extract quantitative features from the EEG data and use statistical testing either to identify significant differences between groups or to compute the explained variance with respect to some external measure. Common features include the amplitude and timing of evoked response potentials (ERPs) [3,4], spectral power across the standard EEG frequency bands [5-7], quantitative metrics of the brain network organization [8-10], and spatial arrangement of ICA components [4,11]. One commonality across these methods is that they draw "static" conclusions at the level of an EEG channel or a brain network. Hence, visualization of these findings is straightforward. The rise of machine learning has spurred new directions in computational electrophysiology focused on time-varying and patient-specific predictive analyses. This paradigm shift has been accelerated by deep learning and platforms, such as PyTorch and TensorFlow, which make such techniques readily available to the research community. Two common application domains are epilepsy monitoring and brain-computer interface (BCI) systems. Much of the work in epilepsy focuses on the problem of seizure detection. This setting is often cast as a binary classification problem, where the goal is to classify whether short windows (1-10 sec) of multi-channel EEG correspond to baseline or seizure activity [12-14].
The methods range from traditional machine learning algorithms applied to hand-crafted features, such as wavelet coefficients [5,15-21], spectral power [6,7,22-26], and non-linear measures [5,17,20,27-31], to end-to-end deep neural networks based on convolutional and recurrent architectures [32-44]. Recent work in epilepsy has pivoted towards localizing the seizure onset from EEG, which adds a spatial component to the temporal predictions [23,45,46]. On the other hand, BCI systems try to decode user intent based on the EEG signals in order to control the environment [47]. One approach detects sensorimotor rhythms generated by motor imagery [48,49], typically by evaluating the EEG frequency content in the C3 and C4 electrodes [50]. Similarly, steady state visually evoked potentials measure stable responses to flickering visual stimuli [51]. These potentials are observed in the occipital lobe and can be detected using methods such as filterbank analysis and canonical correlation analysis [52].
Software packages for EEG can be divided into two categories. The first category focuses on specific analytical techniques, with the visualization options for each package highly targeted towards the method under consideration. Examples include EEGLab [53,54], which is geared towards ERP analysis, EEGNet [55], which emphasizes brain connectivity and network analyses, and BrainStorm [56], which tries to link multimodal information in a common reference space. While these software packages represent seminal contributions to the field, none of them are geared towards viewing the results of time-varying and spatially-varying predictive analyses. The second category of software includes EEG viewers that display and manipulate the raw time series data. The most popular viewer is EDFBrowser [57], which provides a wide range of preprocessing, display, and annotation functionalities. While EDFBrowser is and will remain a valuable resource to the community, it has some notable limitations. For example, the large number of tools makes the interface clunky and difficult to navigate. In addition, EDFBrowser does not have native support for visualizing model predictions, a need that is growing as machine learning analyses become more common.
In this paper, we introduce the EEG Prediction Visualizer (EPViz), a lightweight and flexible EEG viewer that complements existing software resources in the field. EPViz is targeted towards machine learning applications and is built around four core functionalities: (1) displaying and manipulating the multi-channel EEG time series, (2) running PyTorch deep learning models on the data, (3) overlaying channel-wise and time-varying predictions on top of the EEG time series, and (4) saving high-quality images of the results. In addition, EPViz includes basic preprocessing operations, spectral feature extraction, and annotation editing. Finally, EPViz has a built-in anonymizer to facilitate sharing of clinical EEG data between clinicians and engineers. EPViz is freely available for download at https://engineering.jhu.edu/nsa/links/.

Materials and methods
EPViz is a streamlined viewer designed for predictive modeling applications. EPViz is built using the PyQt package (5.15.4) in Python. PyQt allows for easy integration with a range of Python deep learning and machine learning libraries.
The multi-channel EEG data is plotted using the PyQtGraph package, which provides fast updating and real-time user interaction capabilities. The PyEDFlib package is used for loading EDF files, and the Matplotlib package is used for saving high-quality images. Finally, the MNE package [58] is used to generate a 2-D topographic map of channel-wise model predictions on the scalp for enhanced visualization capabilities. This representation is also known as a topoplot. Fig 1 illustrates the EPViz graphical user interface. The "Select File" button allows the user to load an EDF file containing multi-channel EEG data. The popup window asks the user to select which channels to plot. We have included the standard 10-10, 10-20, and bipolar 10-20 montages as preset selections. The user also has the option to load a custom EEG montage via a separate text file. The EEG signals appear in the main display pane. Signals from the default montages are color-coded according to hemisphere (red for left, blue for right, and green for the midline). This is in contrast to EDFBrowser, which defaults to plotting all signals in black. Users can change the ordering and number of plotted signals using the "Change Signals" button. Annotations in the EDF files are plotted as "Notes" at the bottom of the display pane. These are particularly relevant for clinical EEG data. Users can vary the time scale of the plot (1, 5, 10, 20, 25, 30, or 45 seconds) using the "Change Window Size" button. Likewise, they can change the intensity scale via the "Change Amplitude" button. Finally, the "Open Zoom" button allows the user to zoom in on a selected region of the plotting window.
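The hemisphere color-coding described above can be illustrated with a short sketch. It relies on the standard 10-20 naming convention (odd-numbered electrodes on the left, even-numbered on the right, a trailing "z" for the midline). The function names are hypothetical, and the rule that a bipolar pair takes the color of its first electrode is an assumption made for illustration; this is not EPViz's actual implementation.

```python
import re

# Colors follow the scheme described in the text:
# red = left hemisphere, blue = right hemisphere, green = midline.
HEMISPHERE_COLORS = {"left": "red", "right": "blue", "midline": "green"}


def hemisphere_of(electrode: str) -> str:
    """Classify a 10-20 electrode label by its suffix.

    In the 10-20 convention, odd numbers mark the left hemisphere,
    even numbers mark the right, and a trailing 'z' marks the midline.
    """
    label = electrode.strip().upper()
    if label.endswith("Z"):
        return "midline"
    match = re.search(r"(\d+)$", label)
    if match is None:
        raise ValueError(f"cannot infer hemisphere from {electrode!r}")
    return "left" if int(match.group(1)) % 2 == 1 else "right"


def channel_color(channel: str) -> str:
    """Pick a plot color for a channel label.

    Assumption: a bipolar pair such as 'C3-P3' is colored by its first electrode.
    """
    first = channel.split("-")[0]
    return HEMISPHERE_COLORS[hemisphere_of(first)]
```

Labels that carry a vendor prefix or reference suffix (as some EDF files do) would need to be normalized before calling `channel_color`; that step is omitted here.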

Overview of the GUI
EPViz includes basic filtering operations. The high- and low-pass filters are implemented using the SciPy library, and their parameters can be set in the "Change Filter" pop-up. To allow for real-time updating, only the region shown on the screen is filtered. These filtering operations mimic those used in epilepsy and BCI applications. More complex preprocessing, such as ICA, should be done offline prior to loading the file into EPViz.
EPViz also plots the spectrogram of a selected EEG channel. The spectrogram is extracted based on the Fast Fourier Transform magnitude. This time-frequency representation is popular in many EEG applications [6,7,22-26]. Users can toggle the spectrogram via the "Power Spectrum" button.
EPViz computes and displays basic statistics of the EEG data. These include signal mean, variance, and line length. As its name suggests, line length is computed as the sum of absolute differences between consecutive time points of the signal; it is a particularly useful metric in EEG analysis. Beyond these time-domain features, EPViz computes the power within the standard EEG frequency bands: delta (1-4 Hz), theta (4-8 Hz), alpha (8-14 Hz), beta (14-30 Hz).
Annotation editor. The EDF file format provides a means to store text annotations linked to specific time points of the data. Annotations are often added by clinicians to flag salient events, for example during epilepsy monitoring. They are also useful in research studies to indicate the timing of different stimuli or experimental conditions. Similar to EDFBrowser, EPViz includes tools for extracting and modifying textual annotations. Specifically, the annotations are displayed both as a list on the right of the window and in the "Notes" row of the main display pane. As seen in Fig 3, we have also included an annotation editor that lets the user both modify the text of existing annotations and add new annotations. Changes made using the annotation editor will not persist outside of EPViz unless they are saved into a new EDF file.
Clinical anonymization. To facilitate the sharing of clinical data, EPViz includes a built-in anonymizer to strip identifiable information from the EDF file prior to it being saved. Fig 4 shows the anonymization window. Here, users can opt for the default setting, which removes names and dates from the EDF header, or they can selectively edit each field themselves.
The Python code underlying the anonymizer has been verified by the University of Wisconsin (UW) Madison Institutional Review Board and is currently being used for data sharing between UW Madison and Johns Hopkins University.
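To make the idea of header anonymization concrete, the following is a minimal sketch that blanks the identifying fields of an EDF header. The field offsets follow the EDF specification (fixed-width ASCII fields at the start of the file). The function names and the placeholder values are assumptions chosen for illustration; they are not EPViz's defaults, and this is not the IRB-verified code mentioned above.

```python
# Fixed-width ASCII fields at the start of every EDF header, given as
# (byte offset, length) according to the EDF specification.
PATIENT_ID = (8, 80)
RECORDING_ID = (88, 80)
START_DATE = (168, 8)  # formatted as dd.mm.yy


def _put(header: bytearray, field: tuple, text: str) -> None:
    """Overwrite one fixed-width field, padding with spaces as EDF requires."""
    offset, length = field
    if len(text) > length:
        raise ValueError("replacement text is longer than the field")
    header[offset:offset + length] = text.ljust(length).encode("ascii")


def anonymize_header(header: bytes) -> bytes:
    """Return a copy of the header with identifying fields replaced.

    The placeholder strings below are illustrative choices (hypothetical),
    not the values used by EPViz.
    """
    if len(header) < 256:
        raise ValueError("an EDF header is at least 256 bytes long")
    out = bytearray(header)
    _put(out, PATIENT_ID, "X X X X")
    _put(out, RECORDING_ID, "Startdate X X X X")
    _put(out, START_DATE, "01.01.01")
    return bytes(out)
```

In practice one would read the first 256 bytes of the file, pass them through `anonymize_header`, and write the result followed by the untouched remainder into a new file, so that the original recording is never modified in place.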
For convenience, the EEG data and the overlaid model predictions can be saved into a single EDF file. This file can easily be reloaded into EPViz for further analysis and visualization. Logistically, the model predictions are stored as a new data channel and should not interfere with other EEG software packages.

PLOS ONE
Command line options. We have provided command-line support for figure generation of the main display pane (EEG signals) and for data anonymization. These command-line options can be integrated into batch processing pipelines and are particularly useful for comparing different predictive modeling outputs. In summary, EPViz is built so that people with and without technical expertise are able to easily interact with its tools.

Development and release
Software testing procedures. To ensure a smooth user experience, we have added unit testing for EPViz. Our tests cover the main functionalities of the visualizer (e.g., loading, plotting, manipulating, and saving data/images) with extensive coverage of the corresponding source files. Exceptions included purely UI functionalities, which require constant user interaction. Table 1 shows the code coverage of our unit tests. We have also created a GitHub Action to run the unit tests on each pull request to encourage high-quality code integration.
Finally, we have used Pylint (https://pylint.pycqa.org/en/latest/) during the development process to ensure that our EPViz source code conforms to the PEP 8 style guidelines for Python.
Software dissemination. We have included three ways for users to download and install EPViz. First, users can clone our GitHub repository, which contains the most up-to-date version of the code. The repository includes information for developers about how to use EPViz along with test EDF files from the public Children's Hospital of Boston (CHB) and Temple University Hospital (TUH) datasets [25,59,60] that can be used to explore the visualizer functionality. The GitHub repository is linked on our lab webpage: https://engineering.jhu.edu/nsa/.
Second, EPViz is available on PyPI at https://pypi.org/project/EPViz. This page provides instructions on how to install EPViz in Linux, MacOS and Windows, links to our online documentation, and a summary of features and command-line options. There is also a description of the unit tests created for EPViz and instructions for running pylint on any code modifications to ensure compatibility.
Third, EPViz can be downloaded as a standalone package for MacOS and Windows. This option is geared towards users with limited programming experience, who simply want to access the functionalities of EPViz. These packages are available for download at https://engineering.jhu.edu/nsa/.
EPViz is licensed under General Public License (GPL) 3.0.

Online documentation. We have created an extensive online documentation page for EPViz, which can be accessed via our lab website at: https://engineering.jhu.edu/nsa/epviz/. The documentation includes a short video of EPViz, followed by detailed information about each of the main features and tips to help users interact with the visualizer. The user can also download the test EDF files in our GitHub repository and follow along with a step-by-step demo at the end of the page. This demo reviews a common use case for EPViz including loading a file, selecting a montage, loading predictions, navigating through the signal time series, saving a figure, and anonymizing the file. These steps are also covered in the linked video.

Data
We demonstrate the real-world utility of EPViz via a seizure detection experiment. Our scalp

Experimental setup
Predictive models. We compare the performance of eight different methods for seizure detection drawn from recently published work. These methods encompass a range of feature extraction and machine learning techniques. The inputs to each method are one-second windows of multichannel EEG data, provided as a sequence. The outputs are window-level predictions of seizure versus baseline activity. Here, we provide a concise summary of each method and refer the reader to the citations below for additional details.
• CNN-BLSTM: Introduced in [32], this deep learning architecture couples a convolutional neural network (CNN) feature encoder with a recurrent bidirectional long short-term memory (BLSTM) classifier.
• CNN-MLP: Proposed as a baseline for the CNN-BLSTM [32], this method uses the same convolutional encoder, but replaces the BLSTM with a multi-layer perceptron that operates independently on each one-second window of the EEG.
• Wei-CNN: Developed by [41], this is a fully-convolutional deep learning method that uses a single CNN to obtain window-wise seizure versus baseline predictions.
• CNN-2D: Also used as a baseline in [32], this method concatenates the FFT of the channel-wise EEG signals into a 2D matrix and operates on it like an image. This method is inspired by the works of [34,37], which use a similar strategy.
• MLP-XXX: These three methods rely on hand-crafted features extracted channel-wise from the one-second windows as described in [45]. The "time" features consist of sample entropy, signal energy, line length, and largest Lyapunov exponent. The "filterbank" features consist of spectral power in different EEG frequency bands. The classification is performed by a multi-layer perceptron.
• Kaleem-SVM: As introduced in [18], this method operates on the combined time and filterbank features described above but uses a support vector machine (SVM) classifier instead of a deep neural network.
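As an illustration of the hand-crafted "time" features named in the list above, the sketch below splits a single channel into one-second windows and computes line length and signal energy for each. It assumes non-overlapping windows and defines signal energy as the sum of squared samples; both are assumptions, since the exact definitions are given in [45] rather than here. Sample entropy and the largest Lyapunov exponent are omitted, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_windows(signal: Sequence[float], fs: int) -> List[Sequence[float]]:
    """Cut one channel into non-overlapping one-second windows of fs samples.

    Trailing samples that do not fill a whole window are dropped.
    """
    n_windows = len(signal) // fs
    return [signal[i * fs:(i + 1) * fs] for i in range(n_windows)]


def line_length(window: Sequence[float]) -> float:
    """Sum of absolute differences between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(window, window[1:]))


def signal_energy(window: Sequence[float]) -> float:
    """Sum of squared sample values (an assumed definition of energy)."""
    return sum(x * x for x in window)


def time_features(signal: Sequence[float], fs: int) -> List[Tuple[float, float]]:
    """Return (line length, energy) for every one-second window of a channel."""
    return [(line_length(w), signal_energy(w)) for w in split_windows(signal, fs)]
```

Applying `time_features` to each channel and stacking the results gives one feature vector per window, which is the kind of input a window-level classifier such as an MLP or SVM would consume.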
Training and calibration are performed according to [32]. For the CNN-BLSTM, the CNN encoder is pre-trained on individual windows for 10 epochs prior to joint training of the full architecture. The outputs for each method are averaged over 20 consecutive windows to reduce noise in the final predictions. Likewise, the seizure detection thresholds for each method are independently set to allow only two minutes of false positive seizure classifications per hour on the training data.
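The smoothing and calibration steps can be sketched as follows. This is one plausible reading of the procedure, not the implementation from [32]: it assumes a causal moving average over the most recent windows, counts the false positive budget per hour of total training recording, and flags a window when its score is strictly greater than the threshold. The function names and parameters are hypothetical.

```python
from typing import List, Sequence


def moving_average(scores: Sequence[float], width: int = 20) -> List[float]:
    """Average each score with up to width - 1 preceding scores (causal smoothing)."""
    smoothed = []
    running = 0.0
    for i, s in enumerate(scores):
        running += s
        if i >= width:
            running -= scores[i - width]  # drop the score that left the window
        smoothed.append(running / min(i + 1, width))
    return smoothed


def calibrate_threshold(
    scores: Sequence[float],
    is_seizure: Sequence[bool],
    fp_minutes_per_hour: float = 2.0,
    window_sec: float = 1.0,
) -> float:
    """Lowest threshold whose false positives on training data stay within budget.

    A window counts as a detection when its score is strictly greater than
    the returned threshold. The budget is measured per hour of recording
    (an assumption; the text does not specify the denominator).
    """
    hours = len(scores) * window_sec / 3600.0
    allowed = int(fp_minutes_per_hour * 60.0 * hours / window_sec)
    baseline = sorted((s for s, y in zip(scores, is_seizure) if not y), reverse=True)
    if allowed >= len(baseline):
        return float("-inf")  # the budget can never be exceeded
    return baseline[allowed]
```

With one-second windows and one hour of training data, the budget works out to 120 baseline windows, so the threshold is placed at the 121st largest baseline score.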
Finally, we note that our objective in this study is to demonstrate how EPViz can be used to visualize the results of a real-world predictive analysis, rather than to advocate for any particular seizure detection method. Thus, we have selected models that are simple to implement and train, while still being current in the field.
Performance metrics. We evaluate performance at the level of one-second EEG windows and at the level of whole seizures. In the former case, we treat the window-level seizure versus baseline predictions as independent outputs and compute the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) and the Area Under the Precision-Recall Curve (AUC-PR). These metrics capture the behavior when varying the detection threshold. We also compute sensitivity and specificity using the detection thresholds obtained during calibration (see previous section). In the latter (whole seizure) case, we first determine the intervals of contiguous seizure classification based on the calibrated detection thresholds. Any interval that intersects a clinician-annotated seizure is considered a true positive detection; the remaining intervals are considered false positive detections. From here, we quantify the duration of false positive detections, the sensitivity (true positives divided by total number of seizures), and the onset latency.
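The seizure-level evaluation can be sketched in a few lines. The code below groups contiguous positive windows into detection intervals and scores them against annotated seizures. Two details are assumptions rather than statements from the text: sensitivity is computed as the fraction of annotated seizures hit by at least one detection, and onset latency is the start of the earliest overlapping detection minus the annotated onset (so it is negative when the detection begins early). All names are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # [start, end) in window indices


def detected_intervals(flags: List[bool]) -> List[Interval]:
    """Group contiguous runs of positive window-level predictions."""
    intervals: List[Interval] = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            intervals.append((start, i))
            start = None
    if start is not None:
        intervals.append((start, len(flags)))
    return intervals


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two half-open intervals share at least one window."""
    return a[0] < b[1] and b[0] < a[1]


def seizure_level_metrics(flags: List[bool], seizures: List[Interval]) -> Dict:
    """Score detections against clinician-annotated seizure intervals."""
    detections = detected_intervals(flags)
    false_pos = [d for d in detections if not any(overlaps(d, s) for s in seizures)]
    latencies = []
    for s in seizures:
        hits = [d for d in detections if overlaps(d, s)]
        if hits:
            latencies.append(min(d[0] for d in hits) - s[0])
    return {
        "sensitivity": len(latencies) / len(seizures) if seizures else float("nan"),
        "false_positive_windows": sum(end - start for start, end in false_pos),
        "latencies": latencies,
    }
```

With one-second windows, `false_positive_windows` is the false positive duration in seconds, and each latency is likewise in seconds.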
Experimental results. Fig 6 reports the seizure detection results. The box plots are constructed by first averaging the performance metrics across all seizures for a given patient and then plotting the distribution of these averages across patients. Note that the CNN-BLSTM and CNN-MLP have the best median performance at the window level, which suggests that the convolutional encoder is key to learning discriminative representations from the data. At the seizure level, while the deep learning methods achieve similar sensitivity and onset latency, the CNN-BLSTM stands out for its low false positive rate. This observation indicates that (1) the calibration thresholds set during training are more likely to generalize to testing data for the CNN-BLSTM, and (2) the temporal modeling of this architecture may help to suppress less certain seizure predictions. Researchers can also use these images to identify confounding features in the underlying EEG signals, which can then be re-incorporated into their models for improved performance.
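The two-stage aggregation behind the box plots (average within each patient, then summarize across patients) can be written compactly. The sketch below uses hypothetical names and placeholder inputs; it shows only the grouping logic and does not reproduce any value from Fig 6.

```python
from collections import defaultdict
from statistics import mean, median
from typing import Dict, Iterable, Tuple


def per_patient_means(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average a metric over all seizures of each patient.

    `records` holds one (patient_id, metric_value) pair per seizure.
    """
    grouped = defaultdict(list)
    for patient_id, value in records:
        grouped[patient_id].append(value)
    return {pid: mean(values) for pid, values in grouped.items()}


def median_across_patients(records: Iterable[Tuple[str, float]]) -> float:
    """Median of the per-patient averages, i.e. the center line of a box plot."""
    return median(per_patient_means(records).values())
```

Averaging within patients first prevents a patient with many seizures from dominating the distribution shown in each box.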
Finally, we note that the seizure detection performance in the UWM dataset is lower than the performance of the models reported in [32] using the CHB dataset [25,26]. This trend may be partially attributed to the smaller sample size of the UWM dataset. In addition, the CHB dataset contains primarily generalized seizures, whereas the UWM dataset contains a more heterogeneous focal epilepsy cohort.

Comparison between EPViz and existing tools
EPViz provides a streamlined interface that allows users to easily interact with the EEG data and visualize model predictions. Accordingly, EPViz fills an unoccupied niche amongst other EEG visualization and analysis tools. Perhaps the most widely used tool is EDFBrowser [57], which provides similar EDF loading and manipulation capabilities. EDFBrowser also has a large suite of analytical tools, from filtering operations to spectral analysis [57]. While EDFBrowser will remain a powerhouse in the EEG community, we believe that EPViz provides key functionality not incorporated in EDFBrowser. First, EPViz is a streamlined and user-friendly application, which makes it easy for non-technical users to adopt. Second, it is geared towards spatio-temporal predictive analyses; the predictive overlay and topoplots are currently not supported in EDFBrowser. Finally, it generates high-quality images of the data and results, which can be used in scientific publications and presentations.
There are also a variety of EEG software toolboxes developed for Matlab. For example, the popular EEGLab focuses on independent component analysis (ICA) and other time-frequency techniques [53,54]. It also provides a GUI for visualizing events detected by these methods. In contrast, EEGNet provides tools for functional connectivity analysis [55] and includes a pipeline for the relevant preprocessing to construct an EEG connectome. Finally, BrainStorm registers multiple data modalities, including MEG, EEG, and MRI, to produce visualizations and analyses [56]. While these packages provide some overlapping functionality with EPViz, they are not designed for predictive analysis and lack an equivalent of our PyTorch integration. They also rely on Matlab, which is an expensive and not universally available platform. In contrast, EPViz is based on open-source Python packages and is freely available to the community. Table 2 summarizes the key features offered by EPViz, as compared to existing software packages. As seen, EPViz addresses an unmet need in EEG visualization.
Finally, EPViz leverages the MNE library [58] to produce topoplots in a user-friendly manner. While the native MNE library provides many visualization, preprocessing, and analysis tools, using them requires advanced scripting knowledge.

Application domains
EPViz can be used in a variety of clinical and research applications, where the goal is to detect an event from the EEG data. One natural domain is epilepsy. In fact, the experimental testbed in this paper uses EPViz to compare the seizure detection performance across different machine learning methods. Other phenomena of interest are auras and non-epileptic events, both of which can be detected using a similar training and evaluation strategy. Going one step further, EPViz supports channel-wise predictions, which makes it a natural tool for seizure localization studies [45,46], where the goal is to identify a specific area of onset (e.g., lobe and/or hemisphere) and track the seizure activity as it propagates from that location.
Another application domain is BCI. Once again, some studies try to predict subject-level events, such as viewing a particular stimulus, while others home in on specific EEG channels to tease apart the activity.
Finally, even though we have focused this paper on predictive analytics, the features of EPViz can be used to emphasize other aspects of the data. Recall that EPViz can overlay "predictions" contained in an auxiliary file. Hence, the user can create "predictions" that correspond to different experimental conditions. Another option is to create "predictions" that select certain EEG channels and time intervals based on an ERP analysis. Thus, EPViz is a flexible tool that users can adapt for their own needs.

Limitations and future work
EPViz is currently designed to load and manipulate a single EEG recording. This setup makes it amenable to the testing phase of machine learning approaches, i.e., evaluating the performance of a model on new data. However, EPViz cannot be used for model training, which would require processing multiple EEG recordings at a time. Future work will integrate command-line options into EPViz for the user to train a PyTorch deep learning model given a data directory and subject ID list. This trained model can be loaded back into EPViz and applied to new EEG data using the existing functionality. Along the same lines, we will add an option for users to load multiple EEG recordings and toggle between them in the main display pane. From a visualization standpoint, EPViz is optimized for the 10-20 electrode placement system [61]. While the user can load data from other montages, the signals will not be colored according to hemisphere, and EPViz will not generate a topoplot since the electrode placements are not provided in the EDF file. Future work will tackle this issue by allowing the user to add the electrode positions, hemisphere information, and desired ordering to the auxiliary text file mentioned above.
Along the same lines, EPViz has difficulty displaying more than 50 EEG signals at a time. Not only are the signals difficult to see, but the plot updates much more slowly after user interaction (e.g., signal filtering, scrolling through time, zoom functionality). To address this issue, we will integrate a memory management system that allows EPViz to efficiently cache and update data as needed. We will also add a pop-out window for the main display pane to accommodate the additional EEG signals.
Finally, EPViz is configured to apply the loaded models to the entire EEG recording to obtain predictions. While suitable for research purposes, clinicians desire real-time analysis capabilities to assist in their review of continuous recordings. We will explore such add-ons to EPViz in the future. These will likely rely on memory management systems and closer integration with existing clinical software packages.

Conclusion
We have introduced EPViz, a lightweight and user-friendly visualizer for EEG data. EPViz is designed for predictive modeling applications, which are becoming increasingly popular in EEG research. Specifically, EPViz allows the user to generate and overlay predictions on top of the EEG signals, thus providing a mechanism to interpret the model output with respect to the data. EPViz can also generate high-quality images of the predictive modeling outputs to aid in scientific reporting [62]. EPViz is completely open-source and uses Python, which is the fastest-growing programming language for machine learning. Finally, EPViz has been designed for both engineers and clinician-scientists. In particular, we have included spectrogram visualization, which is often used in clinical review of EEG, and a built-in anonymizer to remove identifiable information from the EDF files. EPViz has helped our own team to build an interdisciplinary and inter-institutional collaboration in epilepsy. We hope that it will promote such collaborations for other researchers.